Promoting multiword expressions in A* TAG parsing
Abstract
Multiword expressions (MWEs) are pervasive in natural languages and often have both idiomatic and compositional readings, which leads to high syntactic ambiguity. We show that for some MWE types idiomatic readings are usually the correct ones. We propose a heuristic for an A* parser for Tree Adjoining Grammars which benefits from this knowledge by promoting MWE-oriented analyses. This strategy leads to a substantial reduction in the parsing search space in case of true positive MWE occurrences, while avoiding parsing failures in case of false positives.
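The idea of promoting MWE analyses in a best-first search can be illustrated with a toy example. The sketch below is not the TAG parser or the heuristic from the paper: it swaps parsing for a much simpler task, splitting a token sequence into lexical units, and every name in it (`astar_segment`, the `mwes` lexicon, the unit cost of 1) is an assumption made for illustration. Each unit costs 1 whether it is a single word or a whole MWE, so an analysis that uses an MWE is cheaper than the word-by-word reading, and the heuristic lets A* reach it before expanding the compositional alternative.

```python
import heapq
from fractions import Fraction
from itertools import count


def astar_segment(tokens, mwes):
    """Toy A* search that splits `tokens` into lexical units (words or MWEs).

    Hypothetical illustration only. Every unit costs 1, so an analysis that
    uses an MWE is cheaper than one reading the same words one by one.
    Returns (units, cost, popped), where `popped` counts expanded items.
    """
    n = len(tokens)
    mwes = {tuple(m) for m in mwes}
    max_len = max((len(m) for m in mwes), default=1)

    # Lower bound on a token's share of the cost: a unit of length k costs 1,
    # so a token covered by it accounts for at least 1/k of that cost.
    longest = {}
    for m in mwes:
        for t in m:
            longest[t] = max(longest.get(t, 1), len(m))

    # h[i] = optimistic estimate of the cost of covering tokens[i:].
    h = [Fraction(0)] * (n + 1)
    for i in range(n - 1, -1, -1):
        h[i] = h[i + 1] + Fraction(1, longest.get(tokens[i], 1))

    tie = count()  # tie-breaker so the heap never compares unit lists
    agenda = [(h[0], next(tie), 0, 0, [])]  # (f, tie, g, position, units)
    closed = set()
    popped = 0
    while agenda:
        _, _, g, i, units = heapq.heappop(agenda)
        if i in closed:
            continue
        closed.add(i)
        popped += 1
        if i == n:
            return units, g, popped
        for j in range(i + 1, min(n, i + max_len) + 1):
            unit = tuple(tokens[i:j])
            # Single words are always allowed, so a spurious MWE match
            # can never make the search fail.
            if j == i + 1 or unit in mwes:
                f = g + 1 + h[j]
                heapq.heappush(agenda, (f, next(tie), g + 1, j, units + [unit]))
    return None, None, popped


mwes = [("kicked", "the", "bucket")]
units, cost, popped = astar_segment("he kicked the bucket yesterday".split(), mwes)
print(units, cost, popped)
```

In this toy setting the MWE reading is found at cost 3 after four expansions, and the word-by-word continuation after "kicked" is never expanded. When the lexicon entry does not match contiguously (for example, the input "the bucket" alone), the search falls back to single words instead of failing.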
Similar resources
Multiword Expression-Aware A* TAG Parsing Revisited
A* algorithms enable efficient parsing within the context of large grammars and/or complex syntactic formalisms. Besides, it has been shown that promoting multiword expressions (MWEs) is a beneficial strategy in dealing with syntactic ambiguity. The state-of-the-art A* heuristic for promoting MWEs in tree-adjoining grammar (TAG) parsing has certain drawbacks: it is not monotonic and it composes...
Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing
The integration of multiword expressions in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly pre-identified. This paper evaluates two empirical strategies to integrate multiword units in a real constituency parsing context and shows that the results are not as promising as has sometimes been suggested. Firstly, we show th...
Can Recognising Multiword Expressions Improve Shallow Parsing?
There is significant evidence in the literature that integrating knowledge about multiword expressions can improve shallow parsing accuracy. We present an experimental study to quantify this improvement, focusing on compound nominals, proper names and adjective-noun constructions. The evaluation set of multiword expressions is derived from WordNet and the textual data are downloaded from the web...
Parsing Models for Identifying Multiword Expressions
Multiword expressions lie at the syntax/semantics interface and have motivated alternative theories of syntax like Construction Grammar. Until now, however, syntactic analysis and multiword expression identification have been modeled separately in natural language processing. We develop two structured prediction models for joint parsing and multiword expression identification. The first is base...
Terminology Finite-State Preprocessing for Computational LFG
This paper presents a technique to deal with multiword nominal terminology in a computational Lexical Functional Grammar. This method treats multiword terms as single tokens by modifying the preprocessing stage of the grammar (tokenization and morphological analysis), which consists of a cascade of two-level finite-state automata (transducers). We present here how we build the transducers to ta...
Publication date: 2016